HTML Tags as Extraction Cues for Web Page Description Construction
Similar resources
HTML Tags as Extraction Cues for Web Page Description Construction
Using four previously identified samples of Web pages containing meta-tagged descriptions, the value of meta-tagged keywords, the first 200 characters of the body, and text marked with common HTML tags as extracts helpful for writing summaries was estimated by applying two measures: density of description words and density of two-word description phrases. Generally, titles and keywords showed t...
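The two measures named in this abstract can be illustrated with a small sketch. The abstract does not give the exact formulas, so the definitions here are assumptions: word density is taken as the share of an extract's words that also occur in the meta-tagged description, and phrase density as the share of adjacent word pairs in the extract that also occur in the description. The function names `tokenize`, `word_density`, and `phrase_density` are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def word_density(extract, description):
    """Assumed measure: fraction of extract tokens that occur in the description."""
    extract_tokens = tokenize(extract)
    if not extract_tokens:
        return 0.0
    description_words = set(tokenize(description))
    hits = sum(1 for tok in extract_tokens if tok in description_words)
    return hits / len(extract_tokens)


def phrase_density(extract, description):
    """Assumed measure: fraction of adjacent word pairs in the extract
    that also occur as adjacent pairs in the description."""
    extract_tokens = tokenize(extract)
    extract_pairs = list(zip(extract_tokens, extract_tokens[1:]))
    if not extract_pairs:
        return 0.0
    description_tokens = tokenize(description)
    description_pairs = set(zip(description_tokens, description_tokens[1:]))
    hits = sum(1 for pair in extract_pairs if pair in description_pairs)
    return hits / len(extract_pairs)


# Made-up example data: a meta description and a page title used as the extract.
description = "Cheap flights and hotel deals for European cities"
title = "Cheap flights to European cities"
print(word_density(title, description))
print(phrase_density(title, description))
```

Under these assumed definitions, an extract such as a title scores high when most of its words, or most of its two-word sequences, reappear in the description, which is one way to compare titles, keywords, and tagged text as sources for a summary.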
HTML Page Analysis Based on Visual Cues
In this paper, we present a novel approach to automatically analyzing the semantic structure of HTML pages based on detecting visual similarities of content objects on web pages. The approach is developed based on the observation that in most web pages, layout styles of subtitles or records of the same content category are consistent and there are apparent separation boundaries between different ca...
Embedding Secret Data in HTML Web Page
In this paper, we suggest a novel data hiding technique in an HTML Web page. HTML tags are case insensitive, and hence a lowercase letter and an uppercase letter inside an HTML tag are interpreted in the same manner by the browser, i.e., a change in case in a web page is imperceptible to the browser. We basically exploit this redundancy and use it to embed secret data inside an web pag...
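The case-insensitivity idea in this abstract can be sketched as follows. The abstract is truncated and does not specify the actual encoding, so this is only an assumed scheme: each letter of a tag name carries one bit, uppercase for 1 and lowercase for 0. The names `TAG_NAME`, `embed_bits`, and `extract_bits` are hypothetical, and the simple regular expression does not handle comments, scripts, or attributes.

```python
import re

# Matches "<" or "</" followed by a tag name; group 1 is the tag name itself.
TAG_NAME = re.compile(r"</?([A-Za-z][A-Za-z0-9]*)")


def embed_bits(html, bits):
    """Assumed scheme: write each bit into the case of one tag-name letter
    (uppercase = 1, lowercase = 0), in document order."""
    remaining = list(bits)

    def recase(match):
        out = []
        for ch in match.group(0):
            if ch.isalpha() and remaining:
                bit = remaining.pop(0)
                out.append(ch.upper() if bit == "1" else ch.lower())
            else:
                out.append(ch)  # non-letters, and letters past the payload, stay as-is
        return "".join(out)

    result = TAG_NAME.sub(recase, html)
    if remaining:
        raise ValueError("cover page has too few tag-name letters for the payload")
    return result


def extract_bits(html, count):
    """Read back the first `count` bits from the case of tag-name letters."""
    bits = []
    for match in TAG_NAME.finditer(html):
        for ch in match.group(1):
            if ch.isalpha():
                bits.append("1" if ch.isupper() else "0")
    return "".join(bits[:count])


# Made-up cover page and payload.
cover = "<html><body><p>Hi</p></body></html>"
stego = embed_bits(cover, "1011001")
print(stego)
print(extract_bits(stego, 7))
```

Only tag names are recased in this sketch because attribute values and text content can be case sensitive; the receiver also needs to know the payload length in advance.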
On Reducing Dynamic Web Page Construction Times
Many web sites incorporate dynamic web pages to deliver customized contents to their users. However, dynamic pages result in increased user response times due to their construction overheads. In this paper, we consider mechanisms for reducing these overheads by utilizing the excess capacity with which web servers are typically provisioned. Specifically, we present a caching technique that integ...
Automatic Extraction of Generic Web Page Components
Information on the World Wide Web is accessed not just visually, but also automatically by systems, such as search engines and alternative browsers (e.g. screen readers and voice browsers), which extract and present relevant data automatically from Web pages. In most cases extraction cannot be performed directly, since HTML documents of today lack adequate semantic markup. This thesis proposes ...
Journal
Journal title: Informing Science: The International Journal of an Emerging Transdiscipline
Year: 2003
ISSN: 1547-9684,1521-4672
DOI: 10.28945/509